---
title: "Computational Musicology Dashboard 2023"
output: 
  flexdashboard::flex_dashboard:
    storyboard: true
    social: menu
    source: embed
    orientation: columns
---


```{r setup, include=FALSE}
library(flexdashboard)
library(tidymodels)
library(tibble)
library(ggdendro)
library(stringr)
library(purrr)
library(scales)
library(tidyr)
library(remotes)
library(spotifyr)
library(ggplot2)
library(dplyr)
library(compmus)

all_out_2010 <- get_playlist_audio_features("", "37i9dQZF1DX5Ejj0EkURtP")
all_out_2000 <- get_playlist_audio_features("", "37i9dQZF1DX4o1oenSJRJd")
```


Introduction 
================

Column {.tabset}
--------------------------------

### Tempo Distribution differences
```{r}
# ggplot() has no height/width arguments; the dashboard layout sets the figure size.
ggplot() +
  geom_density(data = all_out_2010, aes(x = tempo, fill = "All Out 2010s"), alpha = 0.5) +
  geom_density(data = all_out_2000, aes(x = tempo, fill = "All Out 2000s"), alpha = 0.5) +
  labs(title = "Tempo Distribution", x = "Tempo (BPM)", y = "Density", fill = "Playlist") +
  scale_fill_manual(values = c("All Out 2000s" = "blue", "All Out 2010s" = "green"))
```

### Energy vs Valence
```{r}
# ggplot() has no height/width arguments; the dashboard layout sets the figure size.
ggplot() +
  geom_point(data = all_out_2010, aes(x = energy, y = valence, color = "All Out 2010s"), alpha = 0.5) +
  geom_point(data = all_out_2000, aes(x = energy, y = valence, color = "All Out 2000s"), alpha = 0.5) +
  labs(title = "Energy vs. Valence", x = "Energy", y = "Valence", color = "Playlist") +
  scale_color_manual(values = c("All Out 2000s" = "blue", "All Out 2010s" = "green"))
```

Column {data-width=400}
----------------------

### Information about the corpus and some graphs

<h2>Corpus information</h2>
The corpus consists of two Spotify playlists, "All Out 2010s" and "All Out 2000s". I listen to both of them often, for example while studying or travelling. They contain many different songs, artists and genres, and there is always something in them that I really enjoy. A few artists who released many big hits appear more often than others, mainly Adele, Justin Bieber and Shawn Mendes, which should also lead to interesting results.

What I am eager to find are new correlations or insights that you cannot see or hear just by listening to the playlists. The 2010s saw the emergence of new genres, such as trap and EDM, as well as the continued popularity of established genres like pop, rock, and hip-hop. This diversity of musical styles offers a rich and varied corpus for analysis and exploration.

Additionally, the 2010s marked a time of significant social and cultural change, with movements like Black Lives Matter and Me Too having a major impact on the music industry and the messages conveyed through music. This could make the corpus of music from the 2010s particularly interesting for researchers interested in studying the intersection of music and social movements.

<h3>Outliers</h3>


Chromagrams
============================================

Column {.tabset}
--------------------------------

### Mirrors by Justin Timberlake

```{r}
mirrors <-
  get_tidy_audio_analysis("4rHZZAmHpZrA3iH5zx8frV") |>
  select(segments) |>
  unnest(segments) |>
  select(start, duration, pitches)

mirrors |>
  mutate(pitches = map(pitches, compmus_normalise, "euclidean")) |>
  compmus_gather_chroma() |> 
  ggplot(
    aes(
      x = start + duration / 2,
      width = duration,
      y = pitch_class,
      fill = value
    )
  ) +
  geom_tile() +
  labs(x = "Time (s)", y = NULL, fill = "Magnitude") +
  theme_minimal() +
  scale_fill_viridis_c()
```

### Stan by Eminem
```{r}
stan <-
  get_tidy_audio_analysis("3UmaczJpikHgJFyBTAJVoz") |>
  select(segments) |>
  unnest(segments) |>
  select(start, duration, pitches)

stan |>
  mutate(pitches = map(pitches, compmus_normalise, "euclidean")) |>
  compmus_gather_chroma() |> 
  ggplot(
    aes(
      x = start + duration / 2,
      width = duration,
      y = pitch_class,
      fill = value
    )
  ) +
  geom_tile() +
  labs(x = "Time (s)", y = NULL, fill = "Magnitude") +
  theme_minimal() +
  scale_fill_viridis_c()
```

Column {data-width=400}
----------------------

### Text about the chromagrams
The chromagrams show clearly which notes occur most in the two songs. What is immediately noticeable is the big difference between them: Stan mainly uses the lower notes on the chromagram, which is more expected in rap, while Mirrors mainly uses the higher (happier) notes. Mirrors also appears to use more minor-sounding notes than Stan.


Cepstrograms
============================================

Column {.tabset}
--------------------------------

### Mirrors by Justin Timberlake
```{r}
mirrors <-
  get_tidy_audio_analysis("4rHZZAmHpZrA3iH5zx8frV") |> # Spotify track ID
  compmus_align(bars, segments) |>                     # group segments by bar
  select(bars) |>
  unnest(bars) |>
  mutate(
    pitches =
      map(segments,
        compmus_summarise, pitches,
        method = "rms", norm = "euclidean"              # one chroma vector per bar
      )
  ) |>
  mutate(
    timbre =
      map(segments,
        compmus_summarise, timbre,
        method = "rms", norm = "euclidean"              # one timbre vector per bar
      )
  )

mirrors |>
  compmus_gather_timbre() |>
  ggplot(
    aes(
      x = start + duration / 2,
      width = duration,
      y = basis,
      fill = value
    )
  ) +
  geom_tile() +
  labs(x = "Time (s)", y = NULL, fill = "Magnitude") +
  scale_fill_viridis_c() +                              
  theme_classic()
```

### Stan by Eminem
```{r}
stan <-
  get_tidy_audio_analysis("3UmaczJpikHgJFyBTAJVoz") |> # Spotify track ID
  compmus_align(bars, segments) |>                     # group segments by bar
  select(bars) |>
  unnest(bars) |>
  mutate(
    pitches =
      map(segments,
        compmus_summarise, pitches,
        method = "rms", norm = "euclidean"              # one chroma vector per bar
      )
  ) |>
  mutate(
    timbre =
      map(segments,
        compmus_summarise, timbre,
        method = "rms", norm = "euclidean"              # one timbre vector per bar
      )
  )

stan |>
  compmus_gather_timbre() |>
  ggplot(
    aes(
      x = start + duration / 2,
      width = duration,
      y = basis,
      fill = value
    )
  ) +
  geom_tile() +
  labs(x = "Time (s)", y = NULL, fill = "Magnitude") +
  scale_fill_viridis_c() +                              
  theme_classic()
```

Column {data-width=400}
----------------------

### Text about the cepstrograms
There is no large, obvious difference between the two plots. What is noticeable is that in Stan c02 and c03 are very high at the beginning and at the end, which makes sense because those parts have more pace than the middle section. In Mirrors, c01 through c03 are quite present throughout, which is expected because the song keeps a very similar tempo and mostly the same notes.



Self-Similarity Matrices
==============================

Column {.tabset}
--------------------------------

### Mirrors by Justin Timberlake
```{r}
mirrors <-
  get_tidy_audio_analysis("4rHZZAmHpZrA3iH5zx8frV") |> # Spotify track ID
  compmus_align(bars, segments) |>                     # group segments by bar
  select(bars) |>
  unnest(bars) |>
  mutate(
    pitches =
      map(segments,
        compmus_summarise, pitches,
        method = "rms", norm = "euclidean"              # one chroma vector per bar
      )
  ) |>
  mutate(
    timbre =
      map(segments,
        compmus_summarise, timbre,
        method = "rms", norm = "euclidean"              # one timbre vector per bar
      )
  )

mirrors |>
  compmus_self_similarity(timbre, "cosine") |> 
  ggplot(
    aes(
      x = xstart + xduration / 2,
      width = xduration,
      y = ystart + yduration / 2,
      height = yduration,
      fill = d
    )
  ) +
  geom_tile() +
  coord_fixed() +
  scale_fill_viridis_c(guide = "none") +
  theme_classic() +
  labs(x = "", y = "")
```

### Stan by Eminem
```{r}
stan <-
  get_tidy_audio_analysis("3UmaczJpikHgJFyBTAJVoz") |> # Spotify track ID
  compmus_align(bars, segments) |>                     # group segments by bar
  select(bars) |>
  unnest(bars) |>
  mutate(
    pitches =
      map(segments,
        compmus_summarise, pitches,
        method = "rms", norm = "euclidean"              # one chroma vector per bar
      )
  ) |>
  mutate(
    timbre =
      map(segments,
        compmus_summarise, timbre,
        method = "rms", norm = "euclidean"              # one timbre vector per bar
      )
  )

stan |>
  compmus_self_similarity(timbre, "cosine") |> 
  ggplot(
    aes(
      x = xstart + xduration / 2,
      width = xduration,
      y = ystart + yduration / 2,
      height = yduration,
      fill = d
    )
  ) +
  geom_tile() +
  coord_fixed() +
  scale_fill_viridis_c(guide = "none") +
  theme_classic() +
  labs(x = "", y = "")
```

Column {data-width=400}
----------------------

### Text about the SSM
For both Mirrors and Stan we can see exactly when most of the instruments are used at once. This is interesting because in Stan you hear this quite clearly at the beginning, whereas in Mirrors it is harder to pinpoint exactly when it happens. This is probably because Mirrors uses more instruments than Stan overall.
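
To make clearer what these matrices compare, here is a minimal sketch of a cosine self-similarity computation on made-up data. It is not the compmus implementation: the `toy_timbre` matrix and the `cosine_distance_matrix` helper are assumptions for illustration only.

```r
# Hypothetical toy data: 4 "bars", each with a 3-dimensional timbre vector.
toy_timbre <- rbind(
  c(1, 0, 0),
  c(2, 0, 0),  # same direction as bar 1, only larger in magnitude
  c(0, 1, 0),
  c(1, 1, 0)
)

# Cosine distance between all pairs of rows: 1 - cos(angle between them).
cosine_distance_matrix <- function(m) {
  unit <- m / sqrt(rowSums(m^2))  # scale every row to length 1
  1 - unit %*% t(unit)
}

d <- cosine_distance_matrix(toy_timbre)
round(d, 2)
```

Bars 1 and 2 end up at distance 0 because cosine distance ignores magnitude and only compares the shape of the vectors, which is why blocks in the matrix mark passages with a similar sound.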

Chordograms
=====================================

Column {.tabset}
--------------------------------

```{r}
circshift <- function(v, n) {
  if (n == 0) v else c(tail(v, n), head(v, -n))
}

#      C     C#    D     Eb    E     F     F#    G     Ab    A     Bb    B
major_chord <-
  c(   1,    0,    0,    0,    1,    0,    0,    1,    0,    0,    0,    0)
minor_chord <-
  c(   1,    0,    0,    1,    0,    0,    0,    1,    0,    0,    0,    0)
seventh_chord <-
  c(   1,    0,    0,    0,    1,    0,    0,    1,    0,    0,    1,    0)

major_key <-
  c(6.35, 2.23, 3.48, 2.33, 4.38, 4.09, 2.52, 5.19, 2.39, 3.66, 2.29, 2.88)
minor_key <-
  c(6.33, 2.68, 3.52, 5.38, 2.60, 3.53, 2.54, 4.75, 3.98, 2.69, 3.34, 3.17)

chord_templates <-
  tribble(
    ~name, ~template,
    "Gb:7", circshift(seventh_chord, 6),
    "Gb:maj", circshift(major_chord, 6),
    "Bb:min", circshift(minor_chord, 10),
    "Db:maj", circshift(major_chord, 1),
    "F:min", circshift(minor_chord, 5),
    "Ab:7", circshift(seventh_chord, 8),
    "Ab:maj", circshift(major_chord, 8),
    "C:min", circshift(minor_chord, 0),
    "Eb:7", circshift(seventh_chord, 3),
    "Eb:maj", circshift(major_chord, 3),
    "G:min", circshift(minor_chord, 7),
    "Bb:7", circshift(seventh_chord, 10),
    "Bb:maj", circshift(major_chord, 10),
    "D:min", circshift(minor_chord, 2),
    "F:7", circshift(seventh_chord, 5),
    "F:maj", circshift(major_chord, 5),
    "A:min", circshift(minor_chord, 9),
    "C:7", circshift(seventh_chord, 0),
    "C:maj", circshift(major_chord, 0),
    "E:min", circshift(minor_chord, 4),
    "G:7", circshift(seventh_chord, 7),
    "G:maj", circshift(major_chord, 7),
    "B:min", circshift(minor_chord, 11),
    "D:7", circshift(seventh_chord, 2),
    "D:maj", circshift(major_chord, 2),
    "F#:min", circshift(minor_chord, 6),
    "A:7", circshift(seventh_chord, 9),
    "A:maj", circshift(major_chord, 9),
    "C#:min", circshift(minor_chord, 1),
    "E:7", circshift(seventh_chord, 4),
    "E:maj", circshift(major_chord, 4),
    "G#:min", circshift(minor_chord, 8),
    "B:7", circshift(seventh_chord, 11),
    "B:maj", circshift(major_chord, 11),
    "D#:min", circshift(minor_chord, 3)
  )

key_templates <-
  tribble(
    ~name, ~template,
    "Gb:maj", circshift(major_key, 6),
    "Bb:min", circshift(minor_key, 10),
    "Db:maj", circshift(major_key, 1),
    "F:min", circshift(minor_key, 5),
    "Ab:maj", circshift(major_key, 8),
    "C:min", circshift(minor_key, 0),
    "Eb:maj", circshift(major_key, 3),
    "G:min", circshift(minor_key, 7),
    "Bb:maj", circshift(major_key, 10),
    "D:min", circshift(minor_key, 2),
    "F:maj", circshift(major_key, 5),
    "A:min", circshift(minor_key, 9),
    "C:maj", circshift(major_key, 0),
    "E:min", circshift(minor_key, 4),
    "G:maj", circshift(major_key, 7),
    "B:min", circshift(minor_key, 11),
    "D:maj", circshift(major_key, 2),
    "F#:min", circshift(minor_key, 6),
    "A:maj", circshift(major_key, 9),
    "C#:min", circshift(minor_key, 1),
    "E:maj", circshift(major_key, 4),
    "G#:min", circshift(minor_key, 8),
    "B:maj", circshift(major_key, 11),
    "D#:min", circshift(minor_key, 3)
  )
```

### Mirrors by Justin Timberlake
```{r}
mirrors <-
  get_tidy_audio_analysis("4rHZZAmHpZrA3iH5zx8frV") |>
  compmus_align(sections, segments) |>
  select(sections) |>
  unnest(sections) |>
  mutate(
    pitches =
      map(segments,
        compmus_summarise, pitches,
        method = "mean", norm = "manhattan"
      )
  )

mirrors |> 
  compmus_match_pitch_template(
    key_templates,         # key profiles; swap in chord_templates for chords
    method = "euclidean",  # distance between chroma and template
    norm = "manhattan"     # normalisation applied before comparing
  ) |>
  ggplot(
    aes(x = start + duration / 2, width = duration, y = name, fill = d)
  ) +
  geom_tile() +
  scale_fill_viridis_c(guide = "none") +
  theme_minimal() +
  labs(x = "Time (s)", y = "")
```

### Stan by Eminem
```{r}
stan <-
  get_tidy_audio_analysis("3UmaczJpikHgJFyBTAJVoz") |>
  compmus_align(sections, segments) |>
  select(sections) |>
  unnest(sections) |>
  mutate(
    pitches =
      map(segments,
        compmus_summarise, pitches,
        method = "mean", norm = "manhattan"
      )
  )

stan |> 
  compmus_match_pitch_template(
    key_templates,         # key profiles; swap in chord_templates for chords
    method = "euclidean",  # distance between chroma and template
    norm = "manhattan"     # normalisation applied before comparing
  ) |>
  ggplot(
    aes(x = start + duration / 2, width = duration, y = name, fill = d)
  ) +
  geom_tile() +
  scale_fill_viridis_c(guide = "none") +
  theme_minimal() +
  labs(x = "Time (s)", y = "")
```

Column {data-width=400}
----------------------

### Text about the chordograms
There is an obvious difference between these plots, and it confirms expectations. Mirrors uses many more notes (keys) than Stan. In Stan there is a moment when everything comes together at once, which is clearly visible in the plot (around 300 s). This also shows a difference between rap and a more cheerful song with higher danceability. Stan's plot further suggests that the song is mostly in major keys, which matches what was seen earlier.
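
A minimal sketch of how a template match like the one behind these plots could work, using the same Manhattan norm and Euclidean distance as the chunks above. The `toy_chroma` vector and the `match_templates` helper are hypothetical, and the real `compmus_match_pitch_template` may differ in its details.

```r
# Rotate a 12-element template so that it starts on another pitch class.
circshift <- function(v, n) {
  if (n == 0) v else c(tail(v, n), head(v, -n))
}

#                C  C# D  Eb E  F  F# G  Ab A  Bb B
major_chord <- c(1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0)

# Hypothetical chroma vector with energy on D, G and B (a G major triad).
toy_chroma <- c(0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1)

# Manhattan-normalise both vectors, then take the Euclidean distance.
manhattan_norm <- function(v) v / sum(abs(v))
match_templates <- function(chroma, templates) {
  sapply(templates, function(tpl) {
    sqrt(sum((manhattan_norm(chroma) - manhattan_norm(tpl))^2))
  })
}

templates <- list(
  "C:maj" = circshift(major_chord, 0),
  "G:maj" = circshift(major_chord, 7)
)
distances <- match_templates(toy_chroma, templates)
names(which.min(distances))
```

The template with the smallest distance is the best match, so darker (lower) tiles in the plots mark the keys that fit a section best.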

Tempograms/Dendrograms
==================================

Column {.tabset}
--------------------------------

### Dendrogram clustering
```{r}
out2010 <-
  get_playlist_audio_features("", "37i9dQZF1DX5Ejj0EkURtP") |>
  add_audio_analysis() |>
  mutate(
    segments = map2(segments, key, compmus_c_transpose),
    pitches =
      map(segments,
        compmus_summarise, pitches,
        method = "mean", norm = "manhattan"
      ),
    timbre =
      map(
        segments,
        compmus_summarise, timbre,
        method = "mean"
      )
  ) |>
  mutate(pitches = map(pitches, compmus_normalise, "clr")) |>
  mutate_at(vars(pitches, timbre), map, bind_rows) |>
  unnest(cols = c(pitches, timbre))

out2010_juice <-
  recipe(
    track.name ~
      danceability +
      energy +
      loudness +
      speechiness +
      acousticness +
      instrumentalness +
      liveness +
      valence +
      tempo +
      duration +
      C + `C#|Db` + D + `D#|Eb` +
      E + `F` + `F#|Gb` + G +
      `G#|Ab` + A + `A#|Bb` + B +
      c01 + c02 + c03 + c04 + c05 + c06 +
      c07 + c08 + c09 + c10 + c11 + c12,
    data = out2010
  ) |>
  step_center(all_predictors()) |>
  step_scale(all_predictors()) |>
  prep(out2010 |> mutate(track.name = str_trunc(track.name, 20))) |>
  juice() |>
  column_to_rownames("track.name")

out2010_dist <- dist(out2010_juice, method = "euclidean")
out2010_dist |>
  hclust(method = "complete") |>
  dendro_data() |>
  ggdendrogram()
```

### Mirrors by Justin Timberlake
```{r}
#mirrors <- get_tidy_audio_analysis("4rHZZAmHpZrA3iH5zx8frV")
#
#mirrors |>
#  tempogram(window_size = 8, hop_size = 1, cyclic = TRUE) |>
#  ggplot(aes(x = time, y = bpm, fill = power)) +
#  geom_raster() +
#  scale_fill_viridis_c(guide = "none") +
#  labs(x = "Time (s)", y = "Tempo (BPM)") +
#  theme_classic()
```

### Stan by Eminem
```{r}
#stan <- get_tidy_audio_analysis("3UmaczJpikHgJFyBTAJVoz")
#
#stan |>
#  tempogram(window_size = 8, hop_size = 1, cyclic = FALSE) |>
#  ggplot(aes(x = time, y = bpm, fill = power)) +
#  geom_raster() +
#  scale_fill_viridis_c(guide = "none") +
#  labs(x = "Time (s)", y = "Tempo (BPM)") +
#  theme_classic()
```

Column {data-width=400}
----------------------

### Text about the dendrogram and tempograms
Here we have the hierarchical clustering of the 2010s corpus. I used the "complete" linkage method because with the other methods the dendrogram was much less clear (it is still not ideal, but I have not been able to fix that).
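
To compare linkage methods side by side, something like the following sketch could help. It uses random stand-in data rather than the playlist features, so `toy_features` is hypothetical; only base R's `dist`, `hclust` and `cophenetic` are used.

```r
set.seed(1)
# Hypothetical stand-in for the scaled track features: 20 tracks, 5 features.
toy_features <- matrix(
  rnorm(100),
  nrow = 20,
  dimnames = list(paste("track", 1:20), NULL)
)
toy_dist <- dist(toy_features, method = "euclidean")

# Cophenetic correlation: how well each tree preserves the original distances.
linkages <- c("single", "average", "complete")
coph <- sapply(linkages, function(m) {
  cor(toy_dist, cophenetic(hclust(toy_dist, method = m)))
})
coph
```

A higher value means the dendrogram distorts the pairwise distances less. On the real playlist data the ranking of the methods may well differ from this toy example.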

I also have a problem with the tempograms of the songs: when I run the code separately it works fine, but when I knit the project it gets stuck in a loop and eventually stops knitting. I do not know how to fix that yet, so the code is left in comments in the .Rmd file.
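
One workaround that might help is to compute the slow result once in an interactive session, save it to disk, and let the knitted document only read the saved file. The sketch below shows the pattern with a stand-in function; `slow_computation`, `cached` and the file name are hypothetical, and this is untested on the actual tempograms.

```r
# Stand-in for an expensive step such as computing a tempogram.
slow_computation <- function() {
  data.frame(time = 0:2, bpm = c(80, 81, 80), power = c(0.9, 0.7, 0.8))
}

# Compute only when no cached copy exists, then always read from the cache.
cached <- function(path, compute) {
  if (!file.exists(path)) saveRDS(compute(), path)
  readRDS(path)
}

cache_file <- file.path(tempdir(), "tempogram-example.rds")
first <- cached(cache_file, slow_computation)   # computes and saves if needed
second <- cached(cache_file, slow_computation)  # reads the saved copy
identical(first, second)
```

In the dashboard the path would point to a file in the project folder instead of `tempdir()`. The knitr chunk option `cache=TRUE` is another way to get a similar effect.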


Conclusion and Thoughts
==================================
The first conclusion is that it is not easy to see very large differences between these corpora.